| System ID | U0002-2701202615480000 |
|---|---|
| DOI | 10.6846/tku202600075 |
| Title (Chinese) | 基於大腦仿生新奇驅動脈衝更新與語意序列生成之持續學習方法 |
| 論文名稱(英文) | SemaSNN-CL: Brain-Inspired Continual Learning with Novelty-Driven Spiking Updates and Semantic Sequence Generation |
| Title (third language) | |
| University | 淡江大學 (Tamkang University) |
| Department (Chinese) | 資訊工程學系博士班 (Doctoral Program, Department of Computer Science and Information Engineering) |
| 系所名稱(英文) | Department of Computer Science and Information Engineering |
| Foreign degree school | |
| Foreign degree college | |
| Foreign degree institute | |
| Academic year | 114 |
| Semester | 1 |
| Publication year | 115 |
| Author (Chinese) | 蕭兆翔 |
| Author (English) | Chao-Hsiang Hsiao |
| ORCID | 0009-0008-1985-7545 |
| Student ID | 811410025 |
| Degree | Doctoral |
| Language | English |
| Second language | |
| Defense date | 2025-12-26 |
| Pages | 111 |
| Examination committee | Co-advisor: 王銀添 (ytwang@mail.tku.edu.tw); Advisor: 張志勇 (cychang@mail.tku.edu.tw); Committee members: 廖文華, 蒯思齊, 石貴平, 楊明豪 |
| Keywords (Chinese) | 持續學習、動態路由、新穎性驅動、突觸標記與捕捉、脈衝神經網路 |
| Keywords (English) | Continual Learning, Dynamic Routing, Novelty-Driven, STC, SNN |
| Keywords (third language) | |
| Subject classification | |
| Abstract (Chinese) | This dissertation proposes a brain-inspired continual learning framework named SemaSNN-CL, designed to address the catastrophic forgetting problem faced by deep neural networks. By emulating the cooperative mechanism of the brain's complementary learning systems, the framework achieves robust lifelong learning without relying on replay of past data. First, the core of this work models the input-driven gating mechanism of the hippocampus by introducing a dynamic routing control strategy: a geometric resonance mechanism for images and a semantic interactive resonance mechanism adjust signal pathways in real time according to input features, converting high-level features into spatially selective gating signals that selectively activate specific neurons to achieve precise pattern separation. Second, the Synaptic Tagging and Capture (STC) hypothesis from neuroscience is translated into an algorithm: a novelty-driven update strategy defines a quantifiable novelty index from each neuron's prediction error on new information, which drives molecular tagging and structural consolidation to actively protect critical memory traces while preserving the plasticity of idle synapses for adapting to new tasks. Finally, experiments confirm that this biomimetic mechanism effectively balances memory stability and learning plasticity, substantially reduces forgetting across tasks, and markedly lowers inference energy consumption through the sparse computation of spiking neural networks. |
| Abstract (English) | This thesis proposes SemaSNN-CL, a brain-inspired continual learning framework designed to mitigate catastrophic forgetting in deep neural networks. By emulating the cooperative principles of the Complementary Learning Systems (CLS) in the brain, SemaSNN-CL enables robust lifelong learning without replaying past data. At its core, SemaSNN-CL instantiates hippocampal-style input-driven gating as a dynamic routing mechanism. We introduce geometric resonance (for vision) and semantic interactive resonance (for sequence modeling) to convert high-level representations into spatially selective gating signals that regulate information flow in real time, selectively activating task-relevant neurons and promoting precise pattern separation. In addition, we translate the neuroscience hypothesis of Synaptic Tagging and Capture (STC) into an algorithmic consolidation process. Specifically, we propose a novelty-driven update rule that quantifies novelty via prediction-error–based signals, triggering tagging and structural consolidation to protect critical memory traces while maintaining the plasticity of idle synapses for future adaptation. Extensive experiments demonstrate that SemaSNN-CL achieves a favorable trade-off between memory stability and learning plasticity, substantially reducing forgetting across tasks under replay-free settings. Moreover, leveraging the sparse computation of spiking neural networks (SNNs), the proposed framework significantly reduces inference energy consumption while preserving competitive performance. |
| Abstract (third language) | |
| Table of Contents |
Acknowledgements i
Abstract xi
Table of Contents xiii
List of Figures xvi
List of Tables xvii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Objectives 3
1.3 Research Scope 4
1.4 Contributions 5
1.5 Thesis Organization 7
Chapter 2 Related Work 8
2.1 Catastrophic Forgetting in Continual Learning and Related Approaches 8
2.1.1 Regularization-based Methods 9
2.1.2 Replay and Distillation Strategies 9
2.1.3 Parameter Isolation and Bio-inspired Dynamic Architectures 10
2.1.4 Parameter-Efficient Fine-Tuning and Prompting Strategies 11
2.2 Biological Mechanisms and Biomimetic Learning Theories 12
2.2.1 Hippocampal Synaptic Pathways and Pattern Separation 12
2.2.2 Memory Allocation and Inhibitory Competition 12
2.2.3 Novelty Detection and Synaptic Tagging and Capture 13
2.3 Dynamic Routing and Feature Gating 14
2.4 Spiking Neural Networks and Temporal Filtering Mechanisms 15
2.5 Semantic Representations and Modular Adaptation in Large Language Models 16
2.6 Summary of Related Work 17
Chapter 3 Framework Overview of SemaSNN-CL 18
3.1 Theoretical Core of the SemaSNN-CL Architecture 18
3.2 Memory Engrams and Spatial Partitioning: Pattern Separation 20
3.3 Universal Neural Dynamics: Physical Filtering at the Cell Membrane 21
3.4 Plasticity Modulation: Weight Consolidation and Gradient Blocking 23
3.5 Conclusions 24
Chapter 4 SemaSNN-CL for Continual Image Classification 26
4.1 Hippocampal System Modeling: Pattern Separation and Structural Plasticity 27
4.1.1 Geometric Resonance and Conditional Pattern Separation 28
4.1.2 Dynamic Evolution and Consolidation of Memory Engrams 30
4.1.3 Physical Addressing and De-inhibition 32
4.2 Neocortical Dynamics: Temporal Filtering with PLIF Neurons 34
4.3 STC Control Loop: From Dopamine Signals to Molecular Consolidation 36
4.3.1 Neuronal Pathway: Novelty Detection 37
4.3.2 Synaptic Pathway: Importance Quantification 37
4.3.3 Gating Logic: Saliency-Based Gating 38
4.3.4 State and Actuation: Gradient Masking and Consolidation 39
4.4 Integrated Workflow: A Small Neocortex and a Large Hippocampus 40
4.5 Summary 42
Chapter 5 SemaSNN-CL for Knowledge Learning in LLMs 43
5.1 Dentate Gyrus Simulation: CBG-Based Pattern Separation 44
5.1.1 Limitations of Linear Projection: Feature Superposition 44
5.1.2 Compact Bilinear Interaction: Second-Order Enhancement 45
5.1.3 Rank Projection: Sketch to Low-Rank Driv 47
5.2 Soft-LIF Temporal Inertial Filtering 47
5.2.1 Leaky Integration: Physical Inertia of Semantic Flow 48
5.2.2 Graded Potential: High-Resolution Semantic Encoding 49
5.2.3 Soft Reset: Non-Saturating Sparsity for Parameter Isolation 50
5.3 LoRA Implementation with Neuromodulatory Control 51
5.3.1 Dynamic Rank-Level Gating 52
5.3.2 Orthogonal Subspace Initialization 53
5.3.3 Regularization Objectives 54
Chapter 6 Experiments and Ablation Studies 56
6.1 Experimental Setup and Datasets 56
6.1.1 Image Classification Datasets 56
6.1.2 Language Model Datasets 58
6.1.3 Evaluation Metrics 60
6.2 Vision Experiments: Stability of Microscopic Geometry 62
6.2.1 Vision Experiment I: MNIST 63
6.2.2 Vision Experiment II: CIFAR-100 64
6.2.3 Vision Experiment III: Tiny-ImageNet 66
6.3 Language Experiments: Isolation of Macroscopic Semantics 70
6.3.1 Language Experiment I: MMLU 70
6.3.2 Language Experiment II: Synthetic Keyword QA 73
6.4 Ablation Studies and Mechanism Validation 78
6.4.1 Ablation Study on Visual Models 78
6.4.2 Ablation Study on Language Models 80
6.5 Visualization and Interpretability Analysis 83
6.5.1 Neuronal Gating Signals in the Vision Model 84
6.5.2 Neuronal Gating Signals in the Language Models 88
6.6 Parameter Efficiency and Computational Cost Analysis 93
6.6.1 Definition of Computational Efficiency Metrics 93
6.6.2 Performance Cost Analysis of the SemaSNN-CL Visual Model 94
6.6.3 Performance Cost Analysis of the SemaSNN-CL Language Model 96
6.7 Summary 97
Chapter 7 Conclusion and Future Work 100
7.1 Conclusions 100
7.2 Future Work 102
References 104

List of Figures
Fig 1.1 Schematic of hippocampal memory formation and pattern separation 2
Fig 3.1 Overall Architecture of SemaSNN-CL 18
Fig 4.1 SemaSNN-CL for Image Classification Architecture 27
Fig 4.2 SemaSNN-CL for Resonance and Pattern Separation in Visual Perception 29
Fig 4.3 Normalized Geometric Drift Strategy 31
Fig 4.4 Physical Addressing in the Hippocampal Module 32
Fig 4.5 Detailed computational flow of a PLIF neuron 34
Fig 4.6 Detailed workflow of the STC control loop 37
Fig 4.7 Overall visual architecture of SemaSNN-CL 40
Fig 5.1 Overall architecture of SemaSNN-CL for language models 44
Fig 5.2 Trainable CBG Gate 45
Fig 5.3 Soft-LIF Node Architecture 48
Fig 5.4 Detailed architecture of SemaSNN-CL for language models 52
Fig 6.1 Partitioning of the MNIST handwritten digit dataset 57
Fig 6.2 Partitioning of the CIFAR-100 dataset 57
Fig 6.3 Partitioning of the Tiny-ImageNet dataset 58
Fig 6.4 Format of the MMLU dataset 58
Fig 6.5 Format of the synthetic keyword question-answering dataset 60
Fig 6.6 Average accuracy on MNIST (10 tasks) 63
Fig 6.7 Average accuracy on CIFAR-100 (20 tasks) 65
Fig 6.8 Average accuracy on Tiny-ImageNet (40 tasks) 67
Fig 6.9 Task-wise accuracy heatmap across all stages on Tiny-ImageNet (40 tasks) 69
Fig 6.10 Average accuracy on MMLU (100×10 tasks) 72
Fig 6.11 Average accuracy on MMLU (10×57 tasks) 72
Fig 6.12 Average accuracy on Synthetic Keyword QA (10×10 tasks) 74
Fig 6.13 Task-wise accuracy heatmap across all stages on SKQA 10×10-tasks 75
Fig 6.14 Average accuracy on Synthetic Keyword QA (1×100 tasks) 76
Fig 6.15 Task-wise accuracy heatmap across all stages on SKQA 1×100-tasks 77
Fig 6.16 Average accuracy curves of the SemaSNN-CL visual ablation experiments 78
Fig 6.17 Average accuracy curves of the SemaSNN-CL language ablation experiments 80
Fig 6.18 Average gating signals of the first 15 Tiny-ImageNet classes 84
Fig 6.19 Gating signals of 5 images per class in Tiny-ImageNet Task 1 85
Fig 6.20 t-SNE visualization of gating signals for 40 Tiny-ImageNet classes 87
Fig 6.21 Average low-rank gating heatmap across five MMLU domains 89
Fig 6.22 Low-rank gating heatmap across five MMLU domains (100 samples each) 90
Fig 6.23 t-SNE visualization of gating signals across five MMLU domains 91
Fig 6.24 Computational efficiency of SemaSNN-CL versus ANN on Tiny-ImageNet 95
Fig 6.25 Performance and cost comparison between SemaSNN-CL and Standard LoRA 96

List of Tables
Table 6.1 Overall performance summary of the MNIST 10-tasks vision experiment 63
Table 6.2 Overall performance summary of the CIFAR-100 20-tasks vision experiment 65
Table 6.3 Overall performance summary of the Tiny-ImageNet 40-tasks vision experiment 67
Table 6.4 Overall performance summary of the MMLU 100×10-tasks language experiment 72
Table 6.5 Overall performance summary of the MMLU 10×57-tasks language experiment 73
Table 6.6 Overall performance summary of the SKQA 10×10-tasks language experiment 74
Table 6.7 Overall performance summary of the SKQA 1×100-tasks language experiment 76
Table 6.8 Summary of metrics for the SemaSNN-CL visual ablation study 79
Table 6.9 Summary of metrics for the SemaSNN-CL language ablation study 81
| Full-text access rights | |
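To make the novelty-driven update summarized in the abstracts concrete, the sketch below shows a minimal, hypothetical PyTorch version of prediction-error-driven synaptic tagging with gradient masking. It is an illustration only, not the SemaSNN-CL implementation; the class name `NoveltyGatedConsolidation`, the `novelty_threshold` parameter, and the use of accumulated gradient magnitude as the importance signal are assumptions introduced here.

```python
import torch
import torch.nn as nn

# Hypothetical sketch (not the thesis code): STC-style consolidation approximated by
# tagging synapses whose accumulated gradient magnitude (a crude stand-in for a
# prediction-error-based novelty signal) exceeds a threshold, then blocking further
# updates to those "consolidated" weights while idle synapses remain plastic.

class NoveltyGatedConsolidation:
    def __init__(self, model: nn.Module, novelty_threshold: float = 0.5):
        self.model = model
        self.novelty_threshold = novelty_threshold
        # Per-parameter importance estimate: the algorithmic analogue of a synaptic tag.
        self.importance = {name: torch.zeros_like(p) for name, p in model.named_parameters()}

    def update_importance(self) -> None:
        # Accumulate |grad| after backward() as the importance / novelty proxy.
        for name, p in self.model.named_parameters():
            if p.grad is not None:
                self.importance[name] += p.grad.detach().abs()

    def mask_gradients(self) -> None:
        # Zero the gradient of consolidated synapses so the optimizer cannot
        # overwrite protected memory traces; unconsolidated synapses stay trainable.
        for name, p in self.model.named_parameters():
            if p.grad is not None:
                plastic = (self.importance[name] < self.novelty_threshold).to(p.grad.dtype)
                p.grad.mul_(plastic)


# Usage inside an ordinary training step:
model = nn.Linear(16, 4)
consolidator = NoveltyGatedConsolidation(model)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
consolidator.update_importance()  # tag synapses that carried large error signals
consolidator.mask_gradients()     # protect tagged synapses from being overwritten
optimizer.step()
optimizer.zero_grad()
```

In the thesis itself the novelty signal, tagging rule, and consolidation schedule follow the STC control loop detailed in Chapter 4 (Sections 4.3.1 to 4.3.4); this sketch only mirrors the overall tag-then-mask flow.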