| Estimator | Space | Time | % | Bia >
THEOREM 3.1. Given MNC sketches $ h_{A} $ and $ h_{B} $ for matrices A and B, the output sparsity $ s_{C} $ of the matrix product C = A B can be exactly computed under the assumptions A1 and A2
Algorithm 1 MNC Sparsity Estimation
Input: MNC sketches $ h_{A} $ and $ h_{B} $ for matrices A and B
Output: Output sparsity $ s_{C} $
1: // a) basic and extended sparsity estimation, incl upper
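The excerpt does not state assumptions A1 and A2, so the following is only a minimal sketch of one case in which row and column non-zero counts determine the output sparsity exactly. It assumes that every row of A, or every column of B, holds at most one non-zero, and that a product of two non-zeros is non-zero. Under that assumption the number of non-zeros in C equals the dot product of A's column counts and B's row counts. The function names and the list-of-lists matrix layout are hypothetical and not taken from the paper; "sparsity" here means the fraction of non-zero cells.

```python
def nnz_counts(matrix):
    """Return (per-row, per-column) non-zero counts of a dense list-of-lists matrix."""
    rows = [sum(1 for v in row if v != 0) for row in matrix]
    cols = [sum(1 for row in matrix if row[j] != 0) for j in range(len(matrix[0]))]
    return rows, cols


def estimate_product_sparsity(A, B):
    """Fraction of non-zeros in C = A B, computed from count vectors only.

    Assumption (ours, not quoted from the source): each row of A or each
    column of B has at most one non-zero, so no two products land on the
    same output cell and the count below is exact.
    """
    m, l = len(A), len(B[0])
    row_a, col_a = nnz_counts(A)
    row_b, col_b = nnz_counts(B)
    if max(row_a) > 1 and max(col_b) > 1:
        raise ValueError("exact case requires max row count of A or max column count of B <= 1")
    nnz_c = sum(ca * rb for ca, rb in zip(col_a, row_b))
    return nnz_c / (m * l)


# Usage: every row of A has exactly one non-zero.
A = [[0, 2, 0],
     [1, 0, 0],
     [0, 3, 0]]
B = [[1, 0, 4, 0],
     [0, 5, 6, 0],
     [7, 0, 0, 8]]
print(estimate_product_sparsity(A, B))
```

When neither condition holds, several products can fall on the same output cell, so the dot product overcounts and an estimator has to fall back on bounds or approximations.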
## Structured and Unstructured Sparsity
- Lots of 'free' wins from exploring sparsity in modern ML models
- Can often prune models to 80%+ sparsity (with retraining)
- Massive speedups

Synthetic Data
■ Generate data with specific data characteristics
■ Systematic evaluation w/ datasize, sparsity, etc distributions?
■ Inappropriate for certain topics: compression, ML accuracy
## “Real”
[J. Sommer, M. Boehm, A. V. Evfimievski, B. Reinwald, P. J. Haas: MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions. SIGMOD 2019]

to further improve H. Zhang, H. Zhang, D. Zhao, and W. Liang. Conditional memory via scalable lookup: A new axis of sparsity for large language models. CoRR, abs/2601.07372, 2026. doi: 10.48550/ARXIV.2601.07372. URL https://doi

Learning (Contd.)
• Constrained Clustering
• Distance Metric Learning
• Manifold based Learning
• Sparsity based Learning (Compressed Sensing)
## Constrained Clustering
When we have any of the following:

Houston, TX, USA
$ ^{4} $ Target Corporation; Sunnyvale, CA, USA
MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions
Johanna Sommer
IBM Germany
Matthias Boehm
Graz University

quantization as described in chapter 2. We could also incorporate compression techniques such as sparsity, k-means clustering, etc. which will be discussed in the later chapters.
2. Even after compression
53 页 |
3.92 MB

N. Shazeer. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. CoRR, abs/2101.03961, 2021. URL https://arxiv.org/abs/2101.03961.
L. Gao, S. Biderman, S. Black

update::Cint)
Update an LDLt or LLt Factorization F of A to a factorization of $ A \pm C C' $.
If sparsity preserving factorization is used, i.e. $ L L' $ == $ P A P' $ then the new factor ... sparse matrices differ from their dense counterparts in that the resulting matrix follows the same sparsity pattern as a given sparse matrix S, or that the resulting sparse matrix has density d, i.e. each ... dimensions m x n with structural zeros at S[I[k], J[k]].
This method can be used to construct the sparsity pattern of the matrix, and is more efficient than using e.g. sparse(I, J, zeros(length(I))).
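To make the factorization update in this excerpt concrete, here is a minimal sketch of the simplest dense case: a rank-one update of an LLt factor, where C is a single vector x and the sign is "+". It uses the textbook rank-one Cholesky update, not the sparse routine the excerpt documents, and the function name is hypothetical.

```python
import math


def cholesky_rank_one_update(L, x):
    """Return L_new with L_new * L_new^T == L * L^T + x * x^T.

    L is a dense lower-triangular factor (list of lists) with a positive
    diagonal; x is a vector of matching length. Inputs are not modified.
    """
    n = len(x)
    L = [row[:] for row in L]  # work on copies
    x = list(x)
    for k in range(n):
        r = math.hypot(L[k][k], x[k])  # new diagonal entry
        c = r / L[k][k]
        s = x[k] / L[k][k]
        L[k][k] = r
        for i in range(k + 1, n):
            L[i][k] = (L[i][k] + s * x[i]) / c
            x[i] = c * x[i] - s * L[i][k]  # uses the already-updated L[i][k]
    return L


# Usage: A = [[4, 2], [2, 3]] has the factor below; update with x = [1, 1].
L = [[2.0, 0.0],
     [1.0, math.sqrt(2.0)]]
L_new = cholesky_rank_one_update(L, [1.0, 1.0])
# L_new * L_new^T should now match A + x x^T = [[5, 3], [3, 4]] up to rounding.
```

The update costs O(n^2) operations, compared with O(n^3) for refactorizing the modified matrix from scratch, which is the usual reason to update a factor in place.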
For